DETAILED ACTION

This action is in response to an amendment filed on December 15, 2006 for the 
application of Ballew et al., for a "System and method for detecting and managing HPC 
node failure" filed April 15, 2004. 
Claims 1-30 are pending in the application. 

Information disclosed and listed on PTO 1449 has been considered. 

Claim 11 has been amended. 

Claims 1-30 are rejected under 35 USC § 102. 

Claim Rejections - 35 USC § 101 

In response to the amendments to claim 11, the previous rejections have been 
withdrawn. 

Specification 

The use of the trademarks HYPERTRANSPORT™ and INFINIBAND® has been 
noted in this application. Each mark should be capitalized wherever it appears and be 
accompanied by the generic terminology. 

Although the use of trademarks is permissible in patent applications, the 
proprietary nature of the marks should be respected and every effort made to prevent 
their use in any manner which might adversely affect their validity as trademarks. 

Claim Rejections - 35 USC § 102 

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that 
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

Claims 1-30 are rejected under 35 U.S.C. 102(b) as being anticipated by Huang 
(U.S. Patent No. 5,748,882). 

As per claim 1, Huang discloses a method for managing HPC node failure (col. 4, 
lines 58-67). Figure 2 shows an example of a high performance node, which has at 
least one processor (col. 4, lines 66-67 through col. 5, lines 1-2); each node 
could therefore have multiple processors providing continuous availability, hence high 
performance computing. The method comprises: 

determining that one of a plurality of HPC nodes has failed (col. 5, lines 15-19) 

each HPC node comprising an integrated fabric (col. 4, lines 66-67 through col. 
5, lines 1-2). Each node contains communication links, communication ports (col. 10, 
lines 65-67), (col. 11, lines 60-61), and the fault tolerance socket (col. 18, lines 28-31) 

removing the failed node from a virtual list of HPC nodes, the virtual list 
comprising one logical entry for each of the plurality of HPC nodes (col. 10, lines 45-50). 
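
For illustration only, the limitation recited above (a virtual list with one logical entry per node, from which a failed node is removed) might be sketched as follows. This sketch is not taken from Huang or from the application; the names NodeEntry, VirtualList, and remove_failed are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class NodeEntry:
    """One logical entry in the virtual list (hypothetical structure)."""
    node_id: int
    status: str = "available"


@dataclass
class VirtualList:
    """Virtual list holding one logical entry per HPC node (hypothetical)."""
    entries: dict = field(default_factory=dict)

    def add(self, node_id: int) -> None:
        self.entries[node_id] = NodeEntry(node_id)

    def remove_failed(self, node_id: int) -> bool:
        """Remove the entry of a node determined to have failed.

        Returns True if an entry was removed, False if none existed.
        """
        return self.entries.pop(node_id, None) is not None


nodes = VirtualList()
for n in range(4):
    nodes.add(n)
removed = nodes.remove_failed(2)   # node 2 is assumed to have failed
remaining = sorted(nodes.entries)  # ids still present in the list
```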

As per claim 2, Huang discloses determining that at least a portion of an HPC job 
was being executed on the failed node (col. 7, lines 49-55) and terminating the HPC job 
(Fig. 5, element 511). 

As per claim 3, Huang discloses determining that the HPC job was associated 
with a subset of the plurality of HPC nodes; and deallocating the subset of HPC nodes 
(col. 7, lines 5-67 through col. 8, lines 1-9). 

As per claim 4, Huang discloses each entry of the virtual list comprising a node 
status and the method further comprising changing the status of each of the subset of 
HPC nodes to "available" (col. 10, lines 50-56). 
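
As a hedged sketch of the steps recited in claims 3 and 4 (deallocating the subset of nodes associated with a terminated job and changing each status to "available"), the following assumes a plain dictionary of node statuses and that the failed node has already been excluded from the subset. The function release_job_nodes and the status values other than "available" are hypothetical and do not come from Huang or the application.

```python
def release_job_nodes(job_nodes, status):
    """Mark every node of a terminated job as 'available' and return their ids."""
    for node_id in job_nodes:
        status[node_id] = "available"
    return sorted(job_nodes)


# Hypothetical status table: node 2 is the failed node, nodes 0 and 1 ran the job.
status = {0: "allocated", 1: "allocated", 2: "failed", 3: "available"}
job_nodes = {0, 1}
freed = release_job_nodes(job_nodes, status)
```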

As per claim 5, Huang discloses determining dimensions of the terminated job 
based on one or more job parameters and an associated policy; dynamically allocating 
a second subset of the plurality of HPC nodes based on the determined dimensions 
(col. 17, lines 1-21) 

executing the terminated job on the allocated second subset (col. 7, lines 5-67 
through col. 8, lines 1-9). 

As per claim 6, Huang discloses the second subset comprising a substantially 
similar set of nodes to the first subset (Fig. 2). 

As per claim 7, Huang discloses dynamically allocating the second subset 
comprises: determining an optimum subset of nodes from a topology of unallocated 
HPC nodes; and allocating the optimum subset (col. 7, lines 5-67 through col. 8, lines 1-9). 

As per claim 8, Huang discloses locating a replacement HPC node for the failed 
HPC node; and updating the logical entry of the failed HPC node with information on the 
replacement HPC node (col. 7, lines 5-67 through col. 8, lines 1-9). 

As per claim 9, Huang discloses determining one of the plurality of HPC nodes 
has failed comprises determining that a repeating communication has not been received 
from the failed node (col. 17, lines 21-30). 

As per claim 10, Huang discloses determining one of the plurality of HPC nodes 
has failed is accomplished through polling (col. 8, lines 43-63). 
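
The two detection approaches recited in claims 9 and 10 (a repeating communication that is not received, and polling) can be contrasted with a minimal sketch. The function names, the timeout value, and the probe callable are assumptions made for illustration and are not drawn from Huang or the application.

```python
def failed_by_heartbeat(last_seen, now, timeout):
    """Return ids of nodes whose repeating message has not arrived within timeout."""
    return sorted(n for n, t in last_seen.items() if now - t > timeout)


def failed_by_polling(node_ids, probe):
    """Return ids of nodes for which the probe callable reports no response."""
    return sorted(n for n in node_ids if not probe(n))


# Hypothetical timestamps (seconds) of the last message received from each node.
last_seen = {0: 100.0, 1: 91.0, 2: 99.5}
late = failed_by_heartbeat(last_seen, now=101.0, timeout=5.0)

# Hypothetical probe: node 2 does not answer when polled.
silent = failed_by_polling([0, 1, 2], probe=lambda n: n != 2)
```

In the first approach the nodes send messages and the monitor only watches for silence; in the second the monitor actively queries each node.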

As per claim 11, Huang discloses software for managing HPC node failure (col. 
4, lines 58-67). Figure 2 shows an example of a high performance node, which has at 
least one processor (col. 4, lines 66-67 through col. 5, lines 1-2); each node 
could therefore have multiple processors providing continuous availability, hence high 
performance computing. The software is operable to: 

determine that one of a plurality of HPC nodes has failed (col. 5, lines 15-19) 

each HPC node comprising an integrated fabric (col. 4, lines 66-67 through col. 
5, lines 1-2). Each node contains communication links, communication ports (col. 10, 
lines 65-67), (col. 11, lines 60-61), and the fault tolerance socket (col. 18, lines 28-31) 

remove the failed node from a virtual list of HPC nodes, the virtual list comprising 
one logical entry for each of the plurality of HPC nodes (col. 10, lines 45-50). 

As per claim 12, Huang discloses the software further operable to determine that 
at least a portion of an HPC job was being executed on the failed node (col. 7, lines 
49-55) and terminate the HPC job (Fig. 5, element 511). 

As per claim 13, Huang discloses the software further operable to determine that 
the HPC job was associated with a subset of the plurality of HPC nodes; and deallocate 
the subset of HPC nodes (col. 7, lines 5-67 through col. 8, lines 1-9). 

As per claim 14, Huang discloses each entry of the virtual list comprising a node 
status and the software further operable to change the status of each of the subset of 
HPC nodes to "available" (col. 10, lines 50-56). 

As per claim 15, Huang discloses the software further operable to determine 
dimensions of the terminated job based on one or more job parameters and an 
associated policy; dynamically allocate a second subset of the plurality of HPC nodes 
based on the determined dimensions (col. 17, lines 1-21) 

execute the terminated job on the allocated second subset (col. 7, lines 5-67 
through col. 8, lines 1-9). 

As per claim 16, Huang discloses the second subset comprising a substantially 
similar set of nodes to the first subset (Fig. 2). 

As per claim 17, Huang discloses the software operable to dynamically allocate 
the second subset comprises software operable to: determine an optimum subset of 
nodes from a topology of unallocated HPC nodes; and allocate the optimum subset (col. 
7, lines 5-67 through col. 8, lines 1-9). 

As per claim 18, Huang discloses the software further operable to locate a 
replacement HPC node for the failed HPC node; and update the logical entry of the 
failed HPC node with information on the replacement HPC node (col. 7, lines 5-67 
through col. 8, lines 1-9). 

As per claim 19, Huang discloses the software operable to determine one of the 
plurality of HPC nodes has failed comprises software operable to determine that a 
repeating communication has not been received from the failed node (col. 17, lines 
21-30). 

As per claim 20, Huang discloses the software operable to determine one of the 
plurality of HPC nodes has failed is accomplished through polling (col. 8, lines 43-63). 

As per claim 21, Huang discloses a system for managing HPC node failure (col. 
4, lines 58-67). Figure 2 shows an example of a high performance node, which has at 
least one processor (col. 4, lines 66-67 through col. 5, lines 1-2); each node 
could therefore have multiple processors providing continuous availability, hence high 
performance computing. The system comprises: 

a plurality of HPC nodes (Fig. 2) 

a management node (Fig. 2, element 104) operable to: 

determine that one of the plurality of HPC nodes has failed (col. 5, lines 15-19) 

each HPC node comprising an integrated fabric (col. 4, lines 66-67 through col. 
5, lines 1-2). Each node contains communication links, communication ports (col. 10, 
lines 65-67), (col. 11, lines 60-61), and the fault tolerance socket (col. 18, lines 28-31) 

remove the failed node from a virtual list of HPC nodes, the virtual list comprising 
one logical entry for each of the plurality of HPC nodes (col. 10, lines 45-50). 

As per claim 22, Huang discloses the management node further operable to: 
determine that at least a portion of an HPC job was being executed on the failed node 
(col. 7, lines 49-55) and terminate the HPC job (Fig. 5, element 511). 

As per claim 23, Huang discloses the management node further operable to: 
determine that the HPC job was associated with a subset of the plurality of HPC nodes; 
and deallocate the subset of HPC nodes (col. 7, lines 5-67 through col. 8, lines 1-9). 

As per claim 24, Huang discloses each entry of the virtual list comprising a node 
status and the management node further operable to change the status of each of the 
subset of HPC nodes to "available" (col. 10, lines 50-56). 

As per claim 25, Huang discloses the management node further operable to: 
determine dimensions of the terminated job based on one or more job parameters and 
an associated policy; dynamically allocate a second subset of the plurality of HPC 
nodes based on the determined dimensions (col. 17, lines 1-21) 

execute the terminated job on the allocated second subset (col. 7, lines 5-67 
through col. 8, lines 1-9). 

As per claim 26, Huang discloses the second subset comprising a substantially 
similar set of nodes to the first subset (Fig. 2). 

As per claim 27, Huang discloses the management node operable to dynamically 
allocate the second subset comprises the management node operable to: determine an 
optimum subset of nodes from a topology of unallocated HPC nodes; and allocate the 
optimum subset (col. 7, lines 5-67 through col. 8, lines 1-9). 

As per claim 28, Huang discloses the management node further operable to: 
locate a replacement HPC node for the failed HPC node; and update the logical entry of 
the failed HPC node with information on the replacement HPC node (col. 7, lines 5-67 
through col. 8, lines 1-9). 

As per claim 29, Huang discloses the management node operable to determine 
one of the plurality of HPC nodes has failed comprises the management node operable 
to determine that a repeating communication has not been received from the failed 
node (col. 17, lines 21-30). 

As per claim 30, Huang discloses the management node operable to determine 
one of the plurality of HPC nodes has failed is accomplished through polling (col. 8, 
lines 43-63). 

Related Prior Art 

The following prior art is considered to be pertinent to applicant's invention, but 
was not relied upon for the claim analysis conducted above. 

Block et al. (U.S. Patent No. 6,918,051), "Node shutdown in clustered computer 
system". 

Dervin et al. (U.S. Patent No. 6,952,766), "Automated node restart in clustered 
computer system". 

Ho et al. (U.S. Patent No. 6,918,063), "System and method for fault tolerance in 
multi-node system". 

Response to Arguments 

Applicant's arguments (see pages 7-13), filed December 15, 2006, with respect to 
the rejection(s) of claim(s) 1-30 under 35 USC § 103 have been fully considered and 
are persuasive. Therefore, the rejection has been withdrawn. However, upon further 
consideration, a new ground(s) of rejection is made over Huang (U.S. Patent No. 
5,748,882). Refer to the corresponding section of the claim analysis for details. 



Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Elmira Mehrmanesh whose telephone number is (571) 
272-5531. The examiner can normally be reached on 8-5 M-F. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Robert W. Beausoliel, can be reached on (571) 272-3645. The fax phone 
number for the organization where this application or proceeding is assigned is 
571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 



Conclusion 





